perm filename AGENDA[AM,DBL]1 blob
sn#372412 filedate 1978-08-09 generic text, type T, neo UTF8
.NSECP(Control Structure)
.QQ
`Objectively' given, `important' problems may arise [in math].
But even then the mathematician
is essentially free to take it or leave it and turn to something else, while
an `important' problem in [any other science] is usually a conflict, a contradiction,
which `must' be resolved. The mathematician has a wide choice of which way to turn,
and he enjoys a very considerable freedom in what he does.
-- von Neumann
.ESS
.BEGIN TURN ON "{}"
.if false then start;
AM is one of those awkward programs whose representations only make
sense if you already understand how they will be operated on. A
discussion of AM's control structure (this chapter and the next) must
thus precede a discussion of concepts and how they are represented
(Chapter {[2] KNOWL}). Section {[2]EXAM1}.{[2]REPRSUM}
gave the reader a
sufficient knowledge of AM's "anatomy" to follow these chapters. Thus
armed with a cursory knowledge of the "statics" of AM, we shall
proceed to describe in detail its "dynamics".
.end;
Section {SECNUM}.1 will give the reader a feeling for the immensity
of AM's search space. This is the "problem". The next section will give the
top-level "solution": the flow of control is governed by a job-list,
an agenda of plausible tasks. Section {SECNUM}.3 will present some
details of this global control scheme.
Chapter {[2] HEURS} deals with the way AM's heuristics operate; this
could be viewed as the "low-level" or ⊗4local⊗* control structure of
AM.
.if false then start;
Chapter {[2] KNOWL} contains some detailed information about the
actual concepts (and heuristics) AM starts with, and a little more
about their design and representation.
The reader is also directed to
Appendix {[2] EXAM2}, which presents
several detailed examples of AM "in action".
.end;
.END
.AGENDA: SECNUM ;
.SSEC(AM's Search)
.QQ
To develop mathematics, one must always labor to substitute ideas for
calculations.
-- Dirichlet
.ESS
Let's first spend a paragraph reviewing how concepts are stored. AM
contains a collection of data structures, called ⊗4concepts⊗*. Each
concept is meant to coincide intuitively with one mathematical idea
(e.g., Sets, Union, Trichotomy). As such, a concept has several
aspects or parts, called ⊗4facets⊗* (e.g., Examples, Definitions,
Domain/range, Worth). If you wish to think of a concept as a
"frame", then its facets are "slots" to be filled in. Each facet of
a concept will either be totally blank, or else will contain a bunch
of ⊗4entries⊗*. For example, the Algorithms facet of the concept
Union may point to several equivalent LISP functions, each of which
can be used to form the union of two sets$$ The reasons for having
multiple algorithms is that sometimes AM will want one that is fast,
sometimes AM will be more concerned with economizing on storage,
sometimes AM will want to "analyze" an algorithm, and for that
purpose it must be a very un-optimized function, etc. $. Even the
"heuristic rules" are merely entries on the appropriate kind of facet
(e.g., the entries on the Interest facet of the Structure concept are
rules for judging the interestingness of Structures$$ A typical such
rule is: "A structure is very interesting if all its elements are
mildly interesting in precisely the same way." $).
At any moment, AM contains a couple hundred concepts, each of which
has only ⊗4some⊗* of its facets filled in. AM starts with 115
concepts, and grows to about 300 concepts before running out of
time/space. Most facets of most concepts are totally blank. AM's
basic activity is to select some facet of some concept, and then try
to fill in some entries for that slot$$ This is not quite complete.
In addition to filling in entries for a given facet/concept pair, AM
may wish to check it, split it up, reorganize it, etc. $. Thus the
primitive kind of "⊗4task⊗*" for AM is to deal with a particular
facet/concept pair. A typical task looks like this:
.B816
Check the entries on the "Domain/range" facet of the "Bag-Insert"
concept
.E
If the average concept has ten or twenty blank facets, and there are
a couple hundred concepts, then clearly there will be about
20x200=4000 "fill-in" type tasks for AM to work on, at any given
moment. If several hundred facets have recently been filled in,
there will be that many "check-entries" type tasks available.
Executing a task happens to take around ten or twenty cpu seconds, so over the course of
a few hours
only a small percentage of these tasks can ever be executed.$$ The precise
"18 seconds average"
figure is not important. All heuristic-search programs
suffer this same
handicap: As the depth to which they've searched increases, the percentage of
nodes (at or above that level) which have been examined ↓_decreases_↓
exponentially (assuming the branching factor b is strictly larger than unity).
$
Since most of these tasks will never be explored, what will make AM
appear smart -- or stupid -- are its choices of which task to pick at
each moment.$$
This is true of all heuristic search programs. The branchier the search,
the more it applies.
$ So it's worth AM's spending a nontrivial amount of time
deciding which task to execute next. On the other hand, it had better
not be ⊗4too⊗* much time, since a task does take only a dozen seconds.$$
The answer is that AM spends this "deciding" time not
just before a task is ⊗4picked⊗*, but rather each time a task is added to
the agenda. A little under 1 cpu second is spent, on the average, to place the
task properly on the agenda, to assign it a meaningful numeric priority value.
So "action time" is roughly one order of magnitude larger than "deciding time". $
One question that must be answered is: What percentage of AM's legal
moves (at any typical moment) would be considered intelligent
choices, and what percentage would be irrational? The answer comes
from empirical results. The percentages vary wildly depending on the
previous few tasks. Sometimes, AM will be obviously "in the middle"
of a sequence of tasks, and only one or two of the legal tasks would
seem plausible. Other times, AM has just completed an investigation
by running into dead-ends, and there may be hundreds of tasks it
could choose and not be criticized. The median case would perhaps
permit about 6 of the legal tasks to be judged reasonable.
It is important for AM to locate one of these currently-plausible
tasks, but it's not worth spending much time deciding which of
⊗4them⊗* to work on next. AM still faces a huge search: find one of
the 6 winners out of a few thousand candidates.
Its choice of tasks is made even more important due to the 10-second
"cycle time" -- the time to investigate/execute one task. A human
user is watching, and ten seconds is a nontrivial amount of time to
him. He can therefore observe, perceive, and analyze each and every
task that AM selects. Even just a few bizarre choices will greatly
lower his opinion of AM's intelligence. The trace of AM's actions is
what counts, not its final results. So AM can't draw much of its
apparent intelligence from the ⊗4speed⊗* of the computer.
Chess-playing programs have had to face the dilemma of the trade-off
between "intelligence" (foresight, inference, processing,...) and
total number of board situations examined. In chess, the
characteristics of current-day machines, language power ⊗4vs.⊗* speed,
and (to some extent) the limitations of our understanding of how to be
sophisticated,
have to date unfortunately
still favored fast, nearly-blind$$ i.e., using a very simple static
evaluation function. $ search. Although machine speed and LISP slowness
may allow
blind search to win over symbolic inference for ⊗4shallow⊗* searches,
it can't provide any more than a constant speed-up factor for an
exponential search. Inference is slowly gaining on brute force,$$ E.g.,
see [Berliner 74]. There, searching is used mainly to verify plausible
moves (a convergent process),
not to discover them (a bushier search).$ and must someday triumph.
Since the number of "legal moves" for AM at any moment is in the
thousands, it is unrealistic to consider "systematically"$$ e.g.,
exhaustively, or using αααβ minimaxing, etc. $ walking through the
entire space that AM can reach. In AM's problem domain, there is so
much "freedom" that symbolic inference finally ⊗4can⊗* win over the
"simple but fast" exploration strategy$$ This is the author's
opinion, partially supported by the results of AM. Paul Cohen
disagrees, feeling that machine speed should be the key to an
automated mathematician's success. $.
.SSEC(Constraining AM's Search)
.QQ
There exist too many combinations to consider all combinations of existing entities;
the creative mind must only ↓_propose_↓ those of potential interest.
-- Poincare'
.ESS
A great deal of heuristic knowledge is required to
constrain the necessary processing effectively, to zero in on a good task
to tackle
next. This is done in two stages.
.BN
λλ A list of plausible facet/concept pairs is maintained. Nothing can
get onto this list unless there is some reason why filling in (or checking) that
facet of that concept would be worthwhile.
λλ All the plausible tasks on this "job list" are ranked by the
number and strength of the different reasons supporting them. Thus
the facet/concept pairs near the top of the list will all be very
promising tasks to work on.
.E
The first of these constraints is akin to replacing a ⊗4legal⊗* move
generator by a ⊗4plausible⊗* move generator. The second kind of
constraint is akin to using a heuristic evaluation function to select
the best move from among the plausible ones.$$
Past AI programs (e.g., [Samuel 67]) have indicated that
constraining generation (1) is more important than sophisticated
ordering of the resultant candidates (2).
This was confirmed by the experiments performed on AM. $
The job-list or ⊗4agenda⊗* is a data structure which is a natural way
to store the results of these procedures. It is (1) a list of all
the plausible tasks which have been generated, and (2) it is kept
ordered by the numeric estimate of how worthwhile each task is. A
typical entry on the agenda might look like this:
.WBOX(10,22) ; TABS 12,50 TURN ON "\"
MBOX Fill in the EXAMPLES facet of the PRIMES concept $
MBOX ⊗8}⊗* $
MBOX ⊗8}⊗* $
MBOX ⊗8}⊗* ⊗4Reasons for filling in this facet⊗* $
MBOX ⊗8}⊗* $
MBOX ⊗8↓⊗* $
MBOX \⊗8⊂∞α→\⊃ $
MBOX \⊗8}⊗6 1. No examples of primes are known so far. \⊗8}⊗* $
MBOX \⊗8}⊗6 2. Focus of attention: AM just defined Primes. \⊗8}⊗* $
⊗8}\%∞α→\$∞ →}
MBOX ⊗8}⊗* $
MBOX ⊗8}⊗* $
MBOX ⊗8}⊗* ⊗4Overall value of these reasons⊗* $
MBOX ⊗8}⊗* $
MBOX ⊗8↓⊗* $
MBOX ⊗2250⊗* $
.EBOX
.COMMENT The funny line above is due to the fact that the MBOX separator is $ ;
.if false then start;
The actual top-level control structure is simply to pluck the top
task from the agenda and execute it. That is, select the
facet/concept pair having the best supporting reasons, and try to
fill in that facet of that concept.
.end;
While the top task is being executed, some new tasks might get
proposed and merged into the agenda. Also, some new concepts might
get created, and this, too, would generate a flurry of new tasks.
After AM stops filling in entries for the facet specified in the
chosen task, it removes that task from the agenda, and moves on to
work on whichever task is the highest-rated at that time.
.ONCE TURN ON "{}"
The reader probably has a dozen good
questions in mind
at this point (e.g., How do the reasons get rated?, How do the tasks
get proposed?, What happens after a task is selected?,...). The next
section should answer most of these. Some more judgmental ones (How
dare you propose a numeric calculus of plausible reasoning?! If you
slightly de-tune all those numbers, does the system's performance
fall apart?...) will be answered in Chapter {[2] EVALU}.
.SSEC(The Agenda)
.QQ
Creative energy is used mainly to ask the right question.
-- Halmos
.ESS
. SSSEC(Why an Agenda?)
.SSS1←SSSECNUM+1; ONCE TURN ON "{}";
This subsection provides motivation for the following one, by arguing
that a job-list scheme is a natural mechanism to use to manage the
task-selection problem AM faces.
.if false then start;
If that seems obvious to you, feel free to skip ahead to section
{SECNUM}.{SSECNUM}.{[2]AGENDASSSEC}, page {[3]AGENDAPAGE}.
.end;
Recall that AM must zero in on one
of the best few tasks to perform next, and it repeatedly makes this
choice. At each moment, there might be thousands of directions to
explore (plausible tasks to consider).
If all the legal tasks were written out, and reasons were thought up
to support each one, then perhaps we could order them by the strength
of those reasons, and thereby settle on the "best" task to work on
next. In order to appear "smart" to the human user, AM should
⊗4never⊗* execute a task having no reasons attached.
.ONCE TURN ON "{}"
Some magical function will be assumed to exist, which provides a
numeric rating, a priority value, for any given task. The function
looks at a given facet/concept pair, examines all the associated
reasons supporting that task, and computes an estimate of how worthwhile it would be
for AM to spend some time now working on that facet of that concept.
So AM will maintain a list of those legal tasks which have some good
reasons tacked onto them, which justify why each task should be
executed, why it is plausible. At least implicitly, AM has a numeric
rating for each task.
.if false then start;
The obvious control algorithm is to choose the
task with the highest rating, and work on that one next.
Assuming the tasks on this list are kept ordered by this numeric
rating, then AM can just repeatedly pluck the highest task and
execute it. While it's executing, some new tasks might get proposed
and added to the list of tasks. Reasons are kept tacked onto each
task on this list, and form the basis for the numeric priority
rating.
.end;
Give or take a few features, this notion of a "job-list" is the one
which AM uses. It is also called an ⊗4agenda⊗*.$$ Borrowed from
Kaplan's term for the job-list present in KRL (see
[Bobrow & Winograd 77]).
For an earlier general discussion of agendas, see [Knuth 68]. $
"A task on the agenda" is the same as "a job on the job-list" is the
same as "a facet/concept pair which has been proposed" is the same as
"an active node in the search space". Henceforth, I'll use the
following all interchangeably: task, facet/concept pair, node,
job.$$ Each of these terms will conjure up a
slightly different image: a "job" is something to do, a "node" is an
item in a search space, "facet/concept pair" reminds you of the
format of a task. $.
.if false then start;
The flavor of agenda-list used here is similar to the control structure of
HEARSAY-II [Lesser/Fennell/Erman/Reddy 75]. Vast numbers of tasks are proposed and added
to the job-list. Occasionally, when some new data arrives,
some task is repositioned.
.end;
. SSSEC(Details of the Agenda scheme)
.AGENDASEC: SECNUM
.AGENDASSEC: SSECNUM
.AGENDASSSEC: SSSECNUM;
.AGENDAPAGE:
At each moment, AM has many plausible tasks (hundreds or even thousands) which
have been suggested for some good reason or other, but haven't been
carried out yet. Each task is at the level of working on a certain
facet of a certain concept: filling it in, checking it, etc. Recall
that each task also has tacked onto it a list of symbolic reasons
explaining why the task is worth doing.
.XGENLINES←XGENLINES-1
.ONCE TURN ON "{}"
In addition, a number (between 0 and 1000) is attached ⊗4to each
reason⊗*, representing some absolute measure of the value of that
reason (at the moment).
One global formula$$ Here is that formula:
{TURN ON "↑↓" }
Worth(J) = ||SQRT(SUM R↓i↑2)|| x α[ 0.2xWorth(A) +
0.3xWorth(F) + 0.5xWorth(C)α], where J = job to be judged = (Act A,
Facet F, Concept C), and α{R↓iα} are the ratings of the reasons
supporting J. For the sample job pictured in the box below, A=Fillin,
F=Examples, C=Sets, α{R↓iα}=α{100,100,200α}. The formula will be repeated --
and explained -- in Section {[2]HEURS}.{[2]FORMULASSEC}, on
page
{[2]FORMULA}. $ combines all the reasons' values into a single priority
value for the task as a whole.
This overall rating is taken to indicate how worthwhile it would be for
AM to bother executing that task, how interesting the task
would probably turn out to be.
The "intelligence" of AM's selection of task is thus seen
to depend on this one formula.
Yet experiments
show
that its precise form is not important. We conclude
that the "intelligence" has been pushed down into the careful
assigning of reasons (and ⊗4their⊗* values) for each proposed task.
.COMMENT Beware of the braces in the last para.;
.XGENLINES←XGENLINES-1
A typical entry on the agenda might look like this:
.WBOX(9,11)
MBOX TASK: Fill-in examples of Sets $
MBOX PRIORITY: 300 $
MBOX REASONS: $
MBOX 100: No known examples for Sets so far. $
MBOX 100: Failed to fillin examples of Set-union, for lack of examples of Sets $
MBOX 200: Focus of attention: AM recently worked on the concept of Set-union $
.EBOX
Notice the similarity of this to the initial few lines which AM types
just after it selects a job to work on.
The priority value of a task serves a dual purpose: First, it
is used to determine which task on the agenda is the most promising
one to work on next. Second, once a task has been chosen, that
task's
priority value
then is used as an estimate of how much time and space it deserves. The
sample task above might rate 20 cpu seconds, and 200 list cells. When
either of these resources is used up, AM terminates work on the
task, and proceeds to pick a new one.
These two limits will be referred to in the sequel as "⊗4time/space quanta⊗*"
which are allocated to the chosen task.
Whenever several techniques exist for satisfying some task, the remaining
time/space quanta are divided evenly among those alternatives; i.e., each
method is tried for a small time. This policy of parceling out time and space quanta is
called "activation energy" in [Hewitt 76] and called
"resource-limited processes" in [Norman & Bobrow 75].
In the case of filling in
examples of sets, the space quantum (200 cells) will be used up quickly
(long before the 20 seconds expire).
There are two big questions now:
.BN
λλ Exactly how is a task proposed and ranked?
.INDENT 18,0,0
How is a plausible new task first formulated?
How do the supporting reasons for the task get assigned?
How does each reason get assigned an absolute numeric rating?
Does a task's priority value change? When and how?
.ONCE INDENT 4,0,0
λλ How does AM execute a task, once it's chosen?
Exactly what can be done during a task's execution?
.END
.ONCE TURN ON "{}α"
The next chapter will deal with both of these questions.